
Physics-informed Blind Reconstruction of Dense Fields from Sparse Measurements using Neural Networks with a Differentiable Simulator

Aloni, Ofek, Fishbain, Barak

arXiv.org Machine Learning

Generating dense physical fields from sparse measurements is a fundamental problem in sampling, signal processing, and many other applications. State-of-the-art methods either use spatial statistics or rely on examples of dense fields in the training phase; such examples are often unavailable, forcing a fallback to synthetic data. Here, we present a reconstruction method that generates dense fields from sparse measurements without assuming availability of the spatial statistics or of examples of the dense fields. This is made possible by introducing an automatically differentiable numerical simulator into the training phase of the method. The method is shown to outperform statistical and neural-network-based methods on a set of three standard problems from fluid mechanics.


Towards Autonomous In-situ Soil Sampling and Mapping in Large-Scale Agricultural Environments

Nguyen, Thien Hoang, Muller, Erik, Rubin, Michael, Wang, Xiaofei, Sibona, Fiorella, McBratney, Alex, Sukkarieh, Salah

arXiv.org Artificial Intelligence

Traditional soil sampling and analysis methods are labor-intensive, time-consuming, and limited in spatial resolution, making them unsuitable for large-scale precision agriculture. To address these limitations, we present a robotic solution for real-time sampling, analysis, and mapping of key soil properties. Our system consists of two main sub-systems: a Sample Acquisition System (SAS) for precise, automated in-field soil sampling; and a Sample Analysis Lab (Lab) for real-time soil property analysis. The system's performance was validated through extensive field trials at a large-scale Australian farm. Experimental results show that the SAS can consistently acquire soil samples with a mass of 50 g at a depth of 200 mm, while the Lab can process each sample within 10 minutes to accurately measure pH and macronutrients. These results demonstrate the potential of the system to provide farmers with timely, data-driven insights for more efficient and sustainable soil management and fertilizer application.

I. INTRODUCTION. Achieving sustainable agricultural resource management requires accurate, high-resolution, and up-to-date data on soil properties such as pH and macronutrients [1], [2]. However, conventional soil sampling and testing methods fail to address this need at scale.


Distributed Multi-robot Online Sampling with Budget Constraints

Shamshirgaran, Azin, Manjanna, Sandeep, Carpin, Stefano

arXiv.org Artificial Intelligence

In multi-robot informative path planning, the problem is to find a route for each robot in a team to visit a set of locations that can provide the most useful data for reconstructing an unknown scalar field. In the budgeted version, each robot is subject to a travel budget limiting the distance it can travel. Our interest in this problem is motivated by applications in precision agriculture, where robots are used to collect measurements to estimate domain-relevant scalar parameters such as soil moisture or nitrate concentrations. In this paper, we propose an online, distributed multi-robot sampling algorithm based on Monte Carlo Tree Search (MCTS), where each robot iteratively selects its next sampling location through communication with the other robots and consideration of its remaining budget. We evaluate the proposed method for varying team sizes and in different environments, and we compare our solution with four baseline methods. Our experiments show that our solution outperforms the baselines when the budget is tight, collecting measurements that lead to smaller reconstruction errors.
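The core loop the abstract describes, a robot running MCTS under a travel budget to pick its next sampling location, can be sketched for a toy single-robot, 1-D case. This is an illustrative reconstruction, not the authors' distributed algorithm: the `values` array, the two-action move model, and the revisit-pays-again reward are all simplifying assumptions.

```python
import numpy as np
from collections import defaultdict

def mcts_next_step(values, pos, budget, n_sim=2000, c=1.4, seed=0):
    """Toy single-robot MCTS planner on a 1-D transect: choose the next
    move (-1 or +1) to maximize measurement value collected before the
    travel budget runs out. Revisited cells pay again (toy model)."""
    rng = np.random.default_rng(seed)
    N = defaultdict(int)     # visit count per (state, action)
    Q = defaultdict(float)   # mean return per (state, action)
    Ns = defaultdict(int)    # visit count per state

    def acts(p):             # feasible moves at position p
        return [a for a in (-1, 1) if 0 <= p + a < len(values)]

    def rollout(p, b):       # random playout until the budget is spent
        total = 0.0
        while b > 0:
            a = rng.choice(acts(p))
            p += a
            b -= 1
            total += values[p]
        return total

    def simulate(p, b):
        if b == 0:
            return 0.0
        s = (p, b)
        if Ns[s] == 0:       # unexpanded leaf: expand and roll out
            Ns[s] = 1
            return rollout(p, b)
        best_u, best_a = -np.inf, None
        for a in acts(p):    # UCT action selection
            u = Q[(s, a)] + c * np.sqrt(np.log(Ns[s]) / (N[(s, a)] + 1e-9))
            if u > best_u:
                best_u, best_a = u, a
        r = values[p + best_a] + simulate(p + best_a, b - 1)
        N[(s, best_a)] += 1
        Ns[s] += 1
        Q[(s, best_a)] += (r - Q[(s, best_a)]) / N[(s, best_a)]
        return r

    for _ in range(n_sim):
        simulate(pos, budget)
    root = (pos, budget)
    return max(acts(pos), key=lambda a: N[(root, a)])
```

With the high-value cells on the right of the robot, the planner should return `+1`; the remaining-budget dependence is captured by keying tree nodes on `(position, budget)`.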


Automated Testing of Spatially-Dependent Environmental Hypotheses through Active Transfer Learning

Harrison, Nicholas, Wallace, Nathan, Sukkarieh, Salah

arXiv.org Artificial Intelligence

The efficient collection of samples is an important factor in outdoor information-gathering applications on account of high sampling costs such as time, energy, and potential destruction to the environment. Utilizing available a priori data can be a powerful tool for increasing efficiency. However, the relationships between this data and the quantity of interest are often not known ahead of time, limiting the ability to leverage this knowledge for improved planning efficiency. To this end, this work combines transfer learning and active learning through a Multi-Task Gaussian Process and an information-based objective function. Through this combination it can explore the space of hypothetical inter-quantity relationships and evaluate these hypotheses in real time, allowing new knowledge to be immediately exploited in future plans. The performance of the proposed method is evaluated against synthetic data and is shown to evaluate multiple hypotheses correctly. Its effectiveness is also demonstrated on real datasets. The technique is able to identify and leverage hypotheses which show a medium or strong correlation to reduce prediction error by a factor of 1.4--3.4 within the first 7 samples, while poor hypotheses are quickly identified and rejected, eventually having no adverse effect.


Blind Polynomial Regression

Natali, Alberto, Leus, Geert

arXiv.org Artificial Intelligence

Fitting a polynomial to observed data is a ubiquitous task in many signal processing and machine learning applications, such as interpolation and prediction. In that context, input and output pairs are available and the goal is to find the coefficients of the polynomial. However, in many applications the input may be partially known or not known at all, rendering conventional regression approaches inapplicable. In this paper, we formally state the (potentially partial) blind regression problem, illustrate some of its theoretical properties, and propose algorithmic approaches to solve it. As a case study, we apply our methods to a jitter-correction problem and corroborate their performance.
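One simple way to attack the jitter-correction instance of blind regression, offered here as a sketch rather than the paper's algorithm, is to alternate between a least-squares fit of the coefficients and a damped Gauss-Newton correction of the uncertain inputs. The function name `blind_polyfit` and the damping constant are our own assumptions.

```python
import numpy as np

def blind_polyfit(y, x_nominal, deg=3, iters=100):
    """Alternating scheme for blind polynomial regression when the inputs
    are only nominally known (e.g. a grid distorted by jitter):
      (1) least-squares coefficients given the current inputs,
      (2) a damped Gauss-Newton step moving each input onto the curve."""
    x = np.asarray(x_nominal, float).copy()
    for _ in range(iters):
        c = np.polyfit(x, y, deg)              # step 1: coefficients
        p = np.poly1d(c)
        dp = p.deriv()
        r = p(x) - y                           # per-sample residuals
        g = dp(x)
        x = x - r * g / (g ** 2 + 1e-3)        # step 2: damped input update
    return c, x
```

The damping term keeps the input update bounded where the polynomial's derivative is near zero; on jittered data the alternation drives the residual well below that of a naive fit at the nominal inputs.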


Optimizing Bayesian acquisition functions in Gaussian Processes

Pawar, Ashish Anil, Warbhe, Ujwal

arXiv.org Machine Learning

Bayesian optimization is a popular technique for optimizing a black-box function, especially in high dimensions. For a known objective function, many optimization methods are readily available to choose from. For a black-box function, since the true nature of the objective is unknown, many of these techniques, including gradient descent, cannot be applied. Alternatives such as grid search and random search do apply, but both are extremely inefficient and time-consuming, especially if the objective function is costly to evaluate. Instead, Bayesian optimization seeks the global optimum by using a surrogate function in place of the real objective, making the computation far more efficient in terms of time or money.
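The surrogate-based loop the abstract describes can be illustrated with a minimal 1-D sketch: a zero-mean Gaussian-process surrogate with an RBF kernel, and the expected-improvement acquisition function maximized over a grid. The kernel length-scale, grid, and toy objective are illustrative assumptions, not from the paper.

```python
import numpy as np
from math import erf, exp, sqrt, pi

def rbf(a, b, ls=0.2):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(X, y, Xs, jitter=1e-6):
    """Posterior mean/std of a zero-mean GP with an RBF kernel."""
    K = rbf(X, X) + jitter * np.eye(len(X))
    Ks = rbf(X, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    v = np.linalg.solve(K, Ks)
    var = 1.0 - np.sum(Ks * v, axis=0)        # rbf(x, x) = 1
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    """EI for maximization: E[max(f - best, 0)] under the GP posterior."""
    ei = np.zeros_like(mu)
    for i in range(len(mu)):
        z = (mu[i] - best) / sigma[i]
        Phi = 0.5 * (1.0 + erf(z / sqrt(2.0)))    # standard normal CDF
        phi = exp(-0.5 * z * z) / sqrt(2.0 * pi)  # standard normal PDF
        ei[i] = (mu[i] - best) * Phi + sigma[i] * phi
    return ei

# Toy loop: maximize f on a grid by always probing the argmax-EI point.
f = lambda x: -(x - 0.7) ** 2
grid = np.linspace(0.0, 1.0, 201)
X = np.array([0.0, 0.5, 1.0])
y = f(X)
for _ in range(15):
    mu, sigma = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y.max()))]
    X = np.append(X, x_next)
    y = np.append(y, f(x_next))
```

Each probe goes where the surrogate predicts the largest expected gain over the incumbent, which is how the method spends few evaluations of the costly objective.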


Gaussian Mixture Estimation from Weighted Samples

Frisch, Daniel, Hanebeck, Uwe D.

arXiv.org Machine Learning

Given a set of samples, the parameters of a Gaussian mixture (GM) are determined so as to best fit the samples in a maximum-likelihood sense. Solutions for equally weighted samples are readily available, expectation-maximization (EM) based methods being the most prevalent because of their low computational requirements and ease of implementation. So it comes as a surprise that GM estimation for weighted samples is hard to find in the literature. It might be even more surprising that the standard reference [1] gives incorrect results, see Figure 1.

2. Context. Applications for sample-to-density function approximation include clustering of unlabeled data [2, 3], multi-target tracking [4, 5], group tracking [6], multilateration [7, 8], and arbitrary density representation in nonlinear filters [9, 10]. A popular basic solution is the k-means algorithm. It does not find a complete density representation, only the means of the individual clusters. The k-means algorithm uses hard sample-to-mean associations and therefore yields merely approximate solutions, but it can be computationally optimized using k-d trees [11, 12]. Moreover, its global optimum can be found deterministically [13], so it can provide an initial guess for more elaborate algorithms. A sample-to-density approximation that is optimal in a maximum-likelihood sense can be sought with numerical optimization techniques such as the Newton algorithm, which has quadratic convergence but high computational demand per iteration; quasi-Newton methods; the method of scoring; or the conjugate gradient method, with slower convergence but less computational effort per iteration [14].
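A minimal sketch of how sample weights can enter EM for a 1-D Gaussian mixture: scale each sample's responsibilities by its weight in the M-step sums. This illustrates the idea only, under our own assumptions (deterministic quantile initialization, normalized weights); it is not the authors' corrected algorithm.

```python
import numpy as np

def weighted_gm_em(x, w, K=2, iters=200):
    """EM for a 1-D Gaussian mixture fit to *weighted* samples: each
    sample's responsibility is multiplied by its weight in the M-step."""
    w = np.asarray(w, float) / np.sum(w)
    mu = np.quantile(x, np.linspace(0.1, 0.9, K))   # deterministic init
    var = np.full(K, np.var(x))
    pi = np.full(K, 1.0 / K)
    for _ in range(iters):
        # E-step: responsibilities r[n, k] ~ pi_k * N(x_n; mu_k, var_k)
        logp = (-0.5 * (x[:, None] - mu[None, :]) ** 2 / var
                - 0.5 * np.log(2 * np.pi * var) + np.log(pi))
        r = np.exp(logp - logp.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weight each responsibility by the sample weight w_n
        rw = r * w[:, None]
        Nk = rw.sum(axis=0)
        pi = Nk
        mu = (rw * x[:, None]).sum(axis=0) / Nk
        var = (rw * (x[:, None] - mu[None, :]) ** 2).sum(axis=0) / Nk + 1e-9
    return pi, mu, var
```

With equal weights this reduces to standard EM; with unequal weights the mixture proportions converge to the weight mass of each cluster rather than its sample count.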


Distance-Penalized Active Learning Using Quantile Search

Lipor, John, Wong, Brandon, Scavia, Donald, Kerkez, Branko, Balzano, Laura

arXiv.org Machine Learning

Adaptive sampling theory has shown that, with proper assumptions on the signal class, algorithms exist to reconstruct a signal in $\mathbb{R}^{d}$ with an optimal number of samples. We generalize this problem to the case of spatial signals, where the sampling cost is a function of both the number of samples taken and the distance traveled during estimation. This is motivated by our work studying regions of low oxygen concentration in the Great Lakes. We show that for one-dimensional threshold classifiers, a tradeoff between the number of samples taken and distance traveled can be achieved using a generalization of binary search, which we refer to as quantile search. We characterize both the estimation error after a fixed number of samples and the distance traveled in the noiseless case, as well as the estimation error in the case of noisy measurements. We illustrate our results in both simulations and experiments and show that our method outperforms existing algorithms in the majority of practical scenarios.
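The quantile-search idea, bisection with the probe placed at a quantile m of the current interval instead of its midpoint, can be sketched for the noiseless 1-D threshold case. The function names below are illustrative, not from the paper.

```python
def quantile_search(is_above, lo, hi, m=0.5, n_steps=30):
    """Locate the change point of a 1-D threshold (step) function.

    is_above(x) returns True iff x is past the (unknown) change point.
    m = 0.5 recovers plain bisection; m != 0.5 biases probes toward one
    end of the interval, which can shorten the distance a physical
    sampler travels between consecutive probes at the cost of a slower
    worst-case shrink rate (max(m, 1 - m) per step instead of 1/2)."""
    for _ in range(n_steps):
        x = lo + m * (hi - lo)
        if is_above(x):
            hi = x
        else:
            lo = x
    return 0.5 * (lo + hi)
```

For example, locating a threshold at 0.3 on [0, 1] with either `m = 0.5` or a biased `m = 0.4` converges to the same point, just at different sample/travel trade-offs.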


Optimally-Weighted Herding is Bayesian Quadrature

Huszár, Ferenc, Duvenaud, David

arXiv.org Machine Learning

Herding and kernel herding are deterministic methods of choosing samples which summarise a probability distribution. A related task is choosing samples for estimating integrals using Bayesian quadrature. We show that the criterion minimised when selecting samples in kernel herding is equivalent to the posterior variance in Bayesian quadrature. We then show that sequential Bayesian quadrature can be viewed as a weighted version of kernel herding which achieves performance superior to any other weighted herding method. We demonstrate empirically a rate of convergence faster than O(1/N). Our results also imply an upper bound on the empirical error of the Bayesian quadrature estimate.
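Greedy kernel herding over a finite candidate pool can be sketched as follows, assuming an RBF kernel and taking the pool's empirical kernel mean as the target embedding. This is a toy rendering of the unweighted herding criterion, not the paper's optimally-weighted variant.

```python
import numpy as np

def kernel_herding(pool, n_select, ls=0.5):
    """Greedy kernel herding over a finite 1-D candidate pool: at step t,
    pick the point x maximizing  mu(x) - (1/(t+1)) * sum_j k(x, x_j),
    where mu is the kernel mean embedding of the pool (RBF kernel)."""
    K = np.exp(-0.5 * (pool[:, None] - pool[None, :]) ** 2 / ls ** 2)
    mu = K.mean(axis=1)             # embedding of the target distribution
    ksum = np.zeros(len(pool))      # running sum of k(., x_j) over picks
    idx = []
    for t in range(n_select):
        score = mu - ksum / (t + 1)
        i = int(np.argmax(score))
        idx.append(i)
        ksum += K[:, i]
    return pool[idx]
```

The second term penalizes candidates close to points already chosen, so the selected set spreads out to match the pool's distribution; this is exactly the criterion the abstract identifies with the posterior variance in Bayesian quadrature.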


Active Learning for Function Approximation

Sung, Kah Kay, Niyogi, Partha

Neural Information Processing Systems

We develop a principled strategy to sample a function optimally for function approximation tasks within a Bayesian framework. Using ideas from optimal experiment design, we introduce an objective function (incorporating both bias and variance) to measure the degree of approximation, and the potential utility of the data points towards optimizing this objective. We show how the general strategy can be used to derive precise algorithms to select data for two cases: learning unit step functions and polynomial functions. In particular, we investigate whether such active algorithms can learn the target with fewer examples. We obtain theoretical and empirical results to suggest that this is the case.

1 INTRODUCTION AND MOTIVATION. Learning from examples is a common supervised learning paradigm that hypothesizes a target concept given a stream of training examples that describes the concept. In function approximation, example-based learning can be formulated as synthesizing an approximation function for data sampled from an unknown target function (Poggio and Girosi, 1990). Active learning describes a class of example-based learning paradigms that seeks out new training examples from specific regions of the input space, instead of passively accepting examples from some data generating source.
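The variance side of such a data-selection criterion can be illustrated for the polynomial case: query the input whose prediction is currently most uncertain under a Bayesian regression posterior. This is a sketch with an assumed standard-normal prior on the coefficients and an illustrative function name, not the paper's exact bias-plus-variance objective.

```python
import numpy as np

def next_query(x_obs, candidates, deg=3, noise=0.01):
    """Variance-driven active selection for Bayesian polynomial
    regression: return the candidate input whose predictive variance
    f(x)^T S f(x) is largest, where S is the coefficient posterior
    covariance under a standard-normal prior and Gaussian noise."""
    def feats(x):                                   # [x^deg, ..., x, 1]
        return np.vander(np.asarray(x, float), deg + 1)
    F = feats(x_obs)
    A = F.T @ F / noise + np.eye(deg + 1)           # posterior precision
    S = np.linalg.inv(A)                            # posterior covariance
    Fc = feats(candidates)
    pred_var = np.sum((Fc @ S) * Fc, axis=1)        # per-candidate variance
    return candidates[np.argmax(pred_var)]
```

With observations clustered near zero, the high-order coefficients stay poorly constrained, so the criterion sends the next query to the edge of the input range, the intuitive "most informative" point.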